Linkage and association analyses of principal components in expression data
Abstract
Performing linkage and association analyses on a large set of correlated data presents an interesting set of problems. In the current setting, we have 3554 expression levels from lymphoblastoid cell lines in 194 individuals from 14 three-generation Utah CEPH (Centre d'Etude du Polymorphisme Humain) pedigrees. We formed multivariate expression phenotypes from six sets of genes. These consisted of a set of genes identified by the data providers as showing common linkage to a region of chromosome 14, as well as five other sets suggested by ontological evidence. Using principal-component analyses, we generated seven quantitative phenotypes for expression levels from these six sets of genes. We performed quantitative genome linkage screens on these traits using the expression traits from the third generation of each pedigree. As expected, the strongest linkage signal was achieved when the trait under analysis was the composite of the expressions of genes previously showing linkage to chromosome 14. In particular, this trait produced a LOD score of 5.2 on chromosome 14. The trait also produced LOD scores over 3.5 on chromosomes 1, 7, 9, and 11; this suggests that these genes may be controlled by additional genetic factors on the genome. Subsequent association analyses on the first two generations of these pedigrees identified two polymorphisms on chromosome 11 as significant after correcting for multiple tests. These results suggest that principal-component analyses are useful for the analysis of pleiotropic loci. Furthermore, we have identified two single-nucleotide polymorphisms that may influence the expression of multiple genes linked to chromosome 14.
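The composite-phenotype step described in the abstract can be illustrated with a minimal sketch. It assumes the composite trait is the first principal component of the standardized expression levels within one gene set; the abstract does not state which components were kept or how the data were scaled, so both choices are assumptions. The function name `first_principal_component`, the gene-set size of 8, and the simulated data are hypothetical and are not taken from the study.

```python
import numpy as np


def first_principal_component(expr):
    """Return (scores, loadings, explained) for the first PC.

    expr: array of shape (n_individuals, n_genes), one gene set.
    Assumption: each gene is standardized before the decomposition.
    """
    X = np.asarray(expr, dtype=float)
    # Standardize each gene (column) to mean 0 and unit sample variance.
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA via singular value decomposition of the standardized matrix.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, 0] * s[0]                # one composite trait value per individual
    loadings = Vt[0]                       # weight of each gene in the composite
    explained = s[0] ** 2 / np.sum(s ** 2) # fraction of variance captured
    return scores, loadings, explained


# Simulated data only: 194 individuals (as in the abstract) and a
# hypothetical set of 8 genes that share one latent factor.
rng = np.random.default_rng(0)
n_individuals, n_genes = 194, 8
shared = rng.normal(size=(n_individuals, 1))
expr = shared + 0.5 * rng.normal(size=(n_individuals, n_genes))

scores, loadings, explained = first_principal_component(expr)
print(scores.shape, round(float(explained), 2))
```

The resulting `scores` vector would then serve as a single quantitative trait in a linkage or association scan, in place of the individual expression levels.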
Similar resources
Choosing the Best Hierarchical Clustering Technique Based on Principal Components Analysis for Suspended Sediment Load Estimation
1- INTRODUCTION The assessment of watershed sediment load is necessary for controlling soil erosion and reducing the potential for sediment production. Differing estimates of sediment amounts, along with the lack of long-term measurements, limit access to reliable data series of erosion rate and sediment yield. Therefore, the observed data of suspended sediment load could be used to ...
Genetic Diversity of Genotypes of Durum Wheat (Triticum Turgidum L.) Genotypes Based on Cluster and Principal Component Analyses
Genetic diversity is the basis of natural evolution and plant breeding, and is an important component of the sustainability of biological systems. The aim of this study was to evaluate 116 genotypes of Triticum turgidum from seven countries in terms of morphological traits. The results showed highly significant differences among the genotypes. The correlation between gra...
Linkage analysis using principal components of gene expression data
The goal of this paper is to investigate the effect of using principal components as a data reduction method for expression data in linkage analysis. We used 45 probes normalized using the Affymetrix Global Scaling that had evidence of high heritability to estimate the first 10 principal components (PC). A genome-wide linkage scan was performed on the 45 expression values and the 10 PCs using 2...
Association analyses of the MAS-QTL data set using grammar, principal components and Bayesian network methodologies
BACKGROUND It has been shown that if genetic relationships among individuals are not taken into account for genome wide association studies, this may lead to false positives. To address this problem, we used Genome-wide Rapid Association using Mixed Model and Regression and principal component stratification analyses. To account for linkage disequilibrium among the significant markers, principa...
Journal title:
- BMC Proceedings
Volume 1, Issue
Pages -
Publication date: 2007